ABSTRACT

This study investigated the robustness of the James 
second-order test (James, 1951; Wilcox, 1989) and the univariate F
test under a two-factor fixed-effect analysis of variance (ANOVA) 
model in which cell variances were heterogeneous and/or distributions 
were nonnormal. With computer-simulated data, Type I error rates and 
statistical power for the two tests were estimated. With data sampled 
from normal distributions, the F test was not robust to variance 
heterogeneity for equal or unequal sample sizes, but the James 
second-order test was robust in these situations. With normal 
distributions, equal variances, and equal sample sizes, the magnitude
of power difference between the two tests was generally small when 
testing the main effects, but the magnitude of power difference 
between the two tests varied when testing the interaction effects. 
With data sampled from nonnormal distributions, although the James 
second-order test generally was liberal when the population 
distribution was skewed, the test was robust under several nonnormal 
distribution situations. Additionally, the robustness of the James 
second-order test in factorial designs may be affected by 
combinations of nonnormal distributions, sample sizes, and variance
patterns. (Contains 22 references and 7 tables.) (Author/SLD)
Abstract

This study investigated the robustness of the James second-order test (James, 
1951; Wilcox, 1989) and the univariate F test under a two-factor fixed-effect ANOVA 
model where cell variances were heterogeneous and/or distributions were non-normal. 
Using computer-simulated data (SAS/IML [1989]), Type I error rates and statistical power
for the two tests were estimated. With data sampled from normal distributions, the F test 
was not robust to variance heterogeneity for equal or unequal sample sizes, but the James 
second-order test was robust in these situations. With normal distributions, equal
variances, and equal sample sizes, the magnitude of power difference between the two 
tests was generally small when testing the main effects, but the magnitude of power 
difference between the two tests varied when testing the interaction effects. With data 
sampled from non-normal distributions, although the James second-order test generally 
was liberal when the population distribution was skewed, the current study showed that 
the test was robust under several non-normal distribution situations. Additionally, the 
robustness of the James second-order test in factorial designs may be affected by 
combinations of non-normal distributions, sample sizes, and variance patterns. (The F test 
was not examined under non-normal distributions because the F test does not provide a 
valid test for many heterogeneous variance situations.)

James and F Tests

A number of studies have investigated the robustness of omnibus tests when 
testing the equality of K means under variance heteroscedasticity and/or distribution non- 
normality. The univariate F-test, the Brown and Forsythe (1974) F*-test, the Welch (1951) 
test, and the James (1951) second-order test are the omnibus tests most frequently 
considered. Earlier studies that dealt with the validity of omnibus tests under variance
heterogeneity and/or distribution non-normality include Brown and Forsythe (1974), 
Clinch and Keselman (1982), Wilcox, Charlin, and Thompson (1986), and Oshima and 
Algina (1992a). These studies showed that neither the F-test nor the alternatives 
adequately control the Type I error rate under the nominal significance level when 
extreme violations of the variance equality and/or normality occur. 

Wilcox (1988) proposed a new alternative, H, which was computationally simpler 
than the James second-order test. Wilcox showed that although the H test has properties 
comparable to the James second-order test, it was slightly less powerful than the James
second-order test. Wilcox (1989) proposed a modification of the H test, H_m, which was shown to
provide statistical power more comparable to the James second-order test. Oshima and
Algina (1992a) pointed out that the Wilcox (1988) study focused on the effect of variance 
heterogeneity for both the James second-order test and the H test when sampling from 
normal distributions. Non-normality was studied but not in combination with variance 
heterogeneity. They argued that not considering the impact of the combined violations of 
variance homogeneity and distribution normality was an important omission. Their 
investigation of the robustness of the James second-order test and the Wilcox H_m test under
heteroscedasticity and non-normality revealed that the empirical Type I error rates for
both tests were affected when variance homogeneity and distribution normality were both
violated. They also indicated that the magnitude of the difference between the empirical Type
I error rate and the nominal α level is positively related to the degree of asymmetry: the
greater the degree of asymmetry, the greater the difference between the empirical Type I
error rate and the nominal α level.

While most of the investigations into the robustness of ANOVA have concentrated 
on the one-factor design, Milligan, Wong, and Thompson (1987) investigated the 
robustness properties of nonorthogonal two-way fixed-effect ANOVA models. They 
concluded that each of the standard computational routines of ANOVA for unequal cell 
size was not robust to the assumptions of variance homogeneity or normality. When 
sample sizes were equal, however, they found that violating the homogeneity of variance 
assumption had little effect on the actual Type I error rate. Although they suggested four 
alternatives for dealing with unbalanced designs with variance heterogeneity or non- 
normal distributions, Keppel (1991, p. 283) stated that none of these alternatives is as 
effective as avoiding unequal sample sizes in the first place. Nonetheless, this alternative 
is often not an option in applied research where unbalanced designs are common. 

Wilcox (1989) generalized his H_m test to situations involving a factorial structure,
yielding the U test. After comparing the robustness properties of the U test and the James
second-order test under various heterogeneous variance conditions, Wilcox (1989)
concluded that both the U test and the James second-order test (a) performed well under
null conditions and generally controlled the Type I error rate under the nominal
α level; (b) provided sufficient power under non-null conditions; and (c) can be extended to
higher-order designs. Hsiung, Olejnik, and Huberty (1994), however, showed that the U
test does not adequately control the Type I error rate when the sample sizes are unequal
and population means differ from zero. Therefore, Hsiung et al. concluded that the U test
is invalid for most practical situations and recommended the James second-order test for
factorial designs.

After conducting a meta-analysis on the robustness of ANOVA to variance 
heterogeneity, Harwell, Rubinstein, Hayes, and Olds (1992) concluded that there is an 
absence of well-documented omnibus tests that can be applied to two-factor fixed-effects 
ANOVA cases. They advised that there is a need for an investigation into the robustness
of available omnibus tests in two-factor ANOVA models. Responding to this call for 
further study of two-factor fixed-effect ANOVA models, the current investigation examines 
the robustness of the F-test and the James second-order test under heteroscedasticity 
and/or non-normality. Oshima and Algina (1992a) had shown that the James second-order 
test was affected by asymmetric distributions in a single factor design, but they only 
included two asymmetrical non-normal distributions (i.e., the Beta and the Exponential
distributions). Moreover, the Exponential distribution is not common in
applied research. Fleishman (1978) indicated that "typical" non-normal empirical
distributions have a degree of skew less than 0.8 and a magnitude of kurtosis
between -0.6 and +0.6. The current study, therefore, examines the robustness of the
univariate F-test and the James second-order test in two-factor designs with data sampled
from more typical non-normal distributions.

Method 

The present study included five two-factor fixed-effect ANOVA models: 2 x 2, 2 x 3, 
3 x 3, 3 x 4, and 4 x 4. Each model was studied under at least six conditions with each
condition defined by sample sizes, population variances, and population distributions. Not 
all models were included for all conditions. Fifteen population distributions were 
considered; each population distribution was defined by the degrees of skew and kurtosis. 
Twenty-six variance patterns were selected; each variance pattern consisted of different
cell variances (Table 1 lists the characteristics of the 26 variance patterns). The sample 
sizes, variance patterns, magnitude of skew, and magnitude of kurtosis are reported in 
Tables 2 to 8 along with the results. 

Insert Table 1 About Here 

The present study used SAS/IML (SAS Inc, 1989) software to generate the data and
compute the test statistics. Using the SAS RANNOR function, scores for each cell were
generated independently, $Y_{ijk} \sim (\mu_{jk}, \sigma_{jk}^{2})$. Each population mean equaled 0 under the null
conditions, and the mean of the first cell (Cell_11) equaled δ under the non-null conditions. Using the
Fleishman (1978) transformation procedure, data were transformed to have a distribution
with the target degrees of skew and kurtosis. For each condition, 10,000 replications were
generated and the proportion of times each omnibus test rejected the null hypothesis at the α = .05 level
was recorded. A test was considered liberal if its empirical Type I error rate exceeded
.0544 (i.e., was more than two standard errors above the nominal significance level).
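
The Fleishman (1978) step can be illustrated with a minimal Python sketch. It assumes the usual statement of the power method, Y = a + bZ + cZ^2 + dZ^3 with a = -c and Z standard normal, and the moment expressions assume coefficients scaled so that the variance equals 1. The function names are hypothetical, and this is not the SAS/IML code used in the study. The first function applies the polynomial; the second returns the variance, skew, and excess kurtosis implied by a given (b, c, d), which are the quantities a solver would match to the target values.

```python
def fleishman_transform(z, b, c, d):
    """Apply the cubic power transformation to a standard normal score z."""
    a = -c  # keeps the transformed mean at zero
    return a + b * z + c * z**2 + d * z**3


def fleishman_moments(b, c, d):
    """Return (variance, skew, excess kurtosis) implied by b, c, d.

    The skew and kurtosis expressions assume the variance equals 1.
    """
    variance = b**2 + 6 * b * d + 2 * c**2 + 15 * d**2
    skew = 2 * c * (b**2 + 24 * b * d + 105 * d**2 + 2)
    kurtosis = 24 * (
        b * d
        + c**2 * (1 + b**2 + 28 * b * d)
        + d**2 * (12 + 48 * b * d + 141 * c**2 + 225 * d**2)
    )
    return variance, skew, kurtosis


# With b = 1 and c = d = 0 the transformation is the identity, so the
# implied moments are those of the standard normal distribution.
print(fleishman_moments(1.0, 0.0, 0.0))
```

In practice the coefficients are found by solving these three equations numerically for the target skew and kurtosis; that solving step is omitted here.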

For each replication, the data were analyzed by using the univariate F-test and the 
James second-order test. For the James second-order test formula, refer to Wilcox (1989);
for the univariate F-test, the unweighted means solution (regression approach) was used.

Results 

Tables 2 and 3 present the results for the F-test and the James second-order test 
based on small (average cell size equals 5, Table 2) and large (average cell size equals 25, 
Table 3) sample sizes when sampling from normal population distributions. Balanced, 
slightly unbalanced, and extremely unbalanced designs were considered. Each table 
includes three variance patterns with the coefficient of variance variation (Keselman &
Rogan, 1978) ranging between 0 and 1.18. 
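
As a rough illustration of how such an index behaves, the Python sketch below assumes the coefficient is the standard deviation of the cell variances (population form) divided by their mean; the paper does not spell out the formula, so this definition and the function name are assumptions. Under that assumption, the 16:1:1 pattern of Tables 2 and 3 gives about 1.18, which matches the upper end of the reported range.

```python
from statistics import mean, pstdev


def variance_cv(cell_variances):
    """Coefficient of variation of the cell variances (population SD / mean)."""
    return pstdev(cell_variances) / mean(cell_variances)


# Cell variances listed row by row for three patterns from Tables 2 and 3.
patterns = {
    "2 x 2 (1, 1 ; 1, 1)": [1, 1, 1, 1],
    "2 x 2 (16, 1 ; 16, 1)": [16, 1, 16, 1],
    "2 x 3 (16, 1, 1 ; 16, 1, 1)": [16, 1, 1, 16, 1, 1],
}
for label, variances in patterns.items():
    print(f"{label}: {variance_cv(variances):.2f}")
```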



Insert Tables 2 and 3 About Here 

Results from Tables 2 and 3 reveal that, under heteroscedasticity, the F-test can 
have empirical Type I error rates greater than the nominal significance level even when 
sample sizes are equal. These results contradict Milligan, Wong, and Thompson (1987), 
who concluded that the F-test is valid under heterogeneous variances when sample sizes
are equal. The present results support Wilcox's (1987) cautionary note that, while equal
sample sizes may reduce the effect of heterogeneous variances on the Type I error rate, the
F-test may still be liberal if the degree of variance heterogeneity is great. In the present
study, with small sample sizes, a variance ratio of 3:1 was sufficient to invalidate the
univariate F-test.

With unequal sample sizes and unequal variances, the F-test can be either 
conservative or liberal depending upon the relationship between the patterns of 
heterogeneity and the sample sizes. This relationship has been shown repeatedly in 
previous research on the effect of variance heterogeneity on ANOVA Type I error rates. It 
has been suggested that the effect of variance heterogeneity in unbalanced designs can be 
reduced if sample sizes are large (e.g., Maxwell and Delaney [1990, p. 110]). The results 
presented here support that belief to a degree, but even with relatively large samples with
extreme sample size inequality, the F-test had empirical Type I error rates less than the 
nominal significance level. These results support Wilcox's (1987) position that it is
difficult to know how large a sample size is needed to reduce the effects of unequal
variances. 

The James second-order test had Type I error rates that ranged between .0456
and .0530 when sample sizes were large and ranged between .0420 and .0544 when sample 
sizes were small across both balanced and unbalanced designs. These results support the 
conclusion that the James second-order test is robust to variance heterogeneity for equal 
or unequal sample sizes when the population distributions are normal. 

Table 4 presents the empirical power estimates for the univariate F-test and the 
James second-order test when sampling from normal population distributions with equal 
variance and equal sample sizes. Results show that for many of the hypothesis tests, the
James second-order test is only slightly less powerful than the univariate F-test. The 
power difference between the two tests is in the range of magnitude from .000 to .052 
when testing main effects and is in the range of magnitude from .000 to .202 when testing 
interaction effects. 



Insert Table 4 About Here 

Results show that when testing the main effects, the magnitude of power difference 
between the two tests was generally small. However, when testing the interaction effects, 
the magnitude of power difference between the two tests varied. The magnitude of the 
power difference depends on the number of interaction contrasts that are conducted. 
Wilcox (1989) suggested using the Bonferroni procedure to adjust the nominal α level for
each contrast (i.e., α' = α / [min(J, K) - 1]). This approach reduces the statistical power
of the James second-order test if the minimum of (J, K) ≥ 3. Using the
Holland-Copenhaver (1987) enhancement to the Bonferroni procedure would likely reduce the
power difference between the F-test and the James second-order test.
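
The Bonferroni adjustment described above, α' = α / [min(J, K) - 1], can be sketched in Python for the five designs in this study. The function name is hypothetical; only the formula comes from the text.

```python
def bonferroni_alpha(alpha, n_rows, n_cols):
    """Per-contrast level alpha / (min(J, K) - 1) for a J x K design."""
    n_contrasts = min(n_rows, n_cols) - 1
    return alpha / n_contrasts


# Designs listed in the Method section, at the nominal level of .05.
for j, k in [(2, 2), (2, 3), (3, 3), (3, 4), (4, 4)]:
    print(f"{j} x {k}: alpha' = {bonferroni_alpha(0.05, j, k):.4f}")
```

When min(J, K) = 2 the divisor is 1 and the level is unchanged, which is consistent with the adjustment costing power only when the minimum of (J, K) is at least 3.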

Results for the James second-order test when data were sampled from non-normal 
distributions are reported in Tables 5 through 7. Univariate F-test results are not 
included in these tables since, as shown previously, the F-test does not provide a valid test 
for many situations where variances are heterogeneous. 

Table 5 presents the results for the James second-order test for the two by four 
fixed-effects ANOVA model. A total of 72 conditions were considered; each condition was 
defined by the sample size (design type), distribution type, and variance pattern. Three
distributions were considered. They were (a) normal distribution, (b) positively skewed- 
leptokurtic non-normal distribution (skew = 1.75 and kurtosis = 3.75), and (c) platykurtic 
non-normal distribution (skew = 0, kurtosis = -1.0). 

Consistent with Tables 2 and 3, the James second-order test was valid when the 
assumption of normality was met. But when data were sampled from a population
distribution that was skewed, the James second-order test frequently had Type I error 
rates greater than the nominal significance level. With the same non-normal distribution, 
the test was more liberal when sample sizes were extremely unequal than when sample 
sizes were equal or slightly unequal. 

When data were sampled from the platykurtic non-normal distribution, the James 
second-order test was robust when sample sizes were equal or slightly unequal, but 
appeared to be liberal when sample sizes were extremely unequal. 

Although the James second-order test may be liberal when the assumption of 
normality is violated, the current results show that the test is robust under several non- 
normal distribution situations. It appears that in a factorial design, the robustness of the 
James second-order test may be affected by combinations of non-normal distributions,
sample sizes, and variance patterns. 

Table 6 presents the results for the James second-order test under a four by four 
fixed-effect balanced factorial design. A total of 72 conditions were included with each 
condition being defined by the sample size, variance pattern, degree of skew, and degree of 
kurtosis. 



Insert Table 6 About Here 

When the assumption of variance homogeneity was met, but normality was violated, 
the James second-order test had empirical Type I error rates that did not exceed two 
standard errors above the nominal significance level. However, the James second-order 
test was conservative when sample sizes were small: empirical Type I error rates were
generally more than two standard errors below the nominal significance level.
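
The two-standard-error band used throughout the results follows from the binomial standard error of a proportion. The Python sketch below uses the study's 10,000 replications and α = .05; the function names and the "within band" label are illustrative choices, not the authors'.

```python
from math import sqrt


def two_se_band(alpha=0.05, n_reps=10_000):
    """Return (lower, upper) bounds two binomial standard errors from alpha."""
    se = sqrt(alpha * (1 - alpha) / n_reps)
    return alpha - 2 * se, alpha + 2 * se


def classify(rate, alpha=0.05, n_reps=10_000):
    """Label an empirical Type I error rate relative to the two-SE band."""
    lower, upper = two_se_band(alpha, n_reps)
    if rate > upper:
        return "liberal"
    if rate < lower:
        return "conservative"
    return "within band"


lower, upper = two_se_band()
print(f"band: {lower:.4f} to {upper:.4f}")
```

The upper bound rounds to the .0544 criterion stated in the Method section, and the lower bound rounds to .0456.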

The James second-order test appeared to be liberal when the degree of skew was 
equal to or greater than 1.0. Yet, the patterns of sample size, variance, and degree of 
kurtosis also had some effect on the robustness of the test. 

With the same degree of skew, the current results show that the James second- 
order test had greater Type I error rates when the degree of kurtosis was small than when 
the degree of kurtosis was large. 

Table 7 presents the results for the James second-order test under two by four 
balanced and unbalanced fixed-effect designs. Data were sampled from 12 non-normal 
distributions; each distribution was defined by the degrees of skew and kurtosis. 

Insert Table 7 About Here

When the variances were equal and the degree of skew was less than 1.50, the 
James second-order test generally controlled the Type I error rate under the nominal
α level. But the James second-order test appeared to be liberal when the degree
of skew was equal to 1.50. With the same degree of skew, the test generally had a greater
Type I error rate when the degree of kurtosis was small than when the degree of kurtosis
was large. Finally, it appears that a balanced design might reduce the effect of skewed 
distributions somewhat. However, as demonstrated in Tables 6 and 7, a balanced design 
cannot be relied on to provide a valid test when distributions are skewed. 

Conclusions 

Contrary to what some believe (Milligan, Wong, & Thompson, 1987), the univariate 
F-test for a factorial design is not robust to the violation of the equal variance assumption 
when sample sizes are equal. The present study shows that the actual Type I error rate 
for the F-test can exceed the nominal significance level when sample sizes are equal but
cell variances differ by as little as a 3-to-1 ratio. The James second-order test, on the
other hand, controls the actual risk of a Type I error under the nominal significance level
(α = .05) when sampled populations have normal distributions. Further, the study
provides some evidence indicating that when all parametric assumptions are met, the 
James second-order test provides statistical power comparable to the univariate F-test at 
least for hypotheses on main effects. Considerably lower power might be obtained for the 
interaction test depending on the dimensions of the factorial structure. The lower power 
can be attributed to the use, in the present study, of the Bonferroni adjustment for 
multiple hypothesis tests. If one of the enhancements to the Bonferroni method were
used, however, the power difference between the univariate test and the James second-order test
would be reduced. In addition, if an omnibus test is of interest and it is
reasonable to assume normal population distributions, the F-test should be abandoned in 
favor of the James second-order test. 

Micceri (1989) reported that sampling from normal population distributions may be 
the exception rather than the rule in educational research. With both skewed-leptokurtic 
and platykurtic distributions, the James second-order test may not adequately control the 
risk of a Type I error to the nominal significance level. The degree of non-normality, 
variance heterogeneity, and the inequality of sample sizes all can affect the actual
Type I error rate. The results of the present study did not make clear the exact
relationship among these three factors. However, it did appear that having equal sample 
sizes can reduce the effect of non-normal distributions and heterogeneous variances on the 
Type I error rate. 

Finally, Keppel (1991, p. 105) indicated that the James second-order test, the 
currently favored procedure, is "simply too complicated for general use." Recently, Oshima 
and Algina (1992b) developed a SAS/IML program for one-factor designs and Hsiung, 
Olejnik, and Oshima (1994) developed a SAS/IML program for two-factor fixed-effect 
designs. Lix and Keselman (1994) have developed a more general program to compute 
approximate degrees of freedom tests for both univariate and multivariate omnibus tests 
as well as tests for contrasts. With these computer programs available for application of
alternatives to the univariate F-test, the "disadvantage" of being computationally
intensive should not be a limitation.

Table 1

Summary Table for the Characteristics of the Variance Patterns Considered

                            Variance Pattern
Pattern   Within Row   Cross Average Row   Within Column   Cross Average Column
1         Equal        Equal               Equal           Equal
2^a       Equal        Unequal             Unequal         Equal
3^b       Unequal      Equal               Equal           Unequal
4^c       Unequal      Equal               Unequal         Equal
5^d       Unequal      Equal               Unequal         Unequal
6^e       Unequal      Unequal             Unequal         Equal
7^f       Unequal      Unequal             Unequal         Unequal

Note.
^a For examples, Tables 2 and 3: all unequal variance patterns; Tables 5 and 8: (1, 1, 1, 1 ; 9, 9, 9, 9); and Table 6: (1, 1, 1, 1 ; 4, 4, 4, 4 ; 16, 16, 16, 16).
^b For examples, Table 5: (16, 9, 4, 1 ; 16, 9, 4, 1) and Tables 6 and 7: (1, 4, 9, 16 ; 1, 4, 9, 16 ; 1, 4, 9, 16).
^c For examples, Table 5: (4, 4, 1, 1 ; 1, 1, 4, 4) and Table 7: (1, 16, 9, 4 ; 4, 1, 16, 9 ; 9, 4, 1, 16 ; 16, 9, 4, 1).
^d For example, Tables 5 and 8: (16, 9, 4, 1 ; 1, 4, 9, 16).
^e For example, Table 5: (16, 14, 12, 10 ; 2, 4, 6, 8).
^f For examples, Table 5: (16, 14, 12, 10 ; 8, 6, 4, 2) and Table 6: (1, 4, 9, 16 ; 16, 13, 8, 1 ; 4, 9, 16, 1).
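
The four characteristics in Table 1 can be checked mechanically for any grid of cell variances. The Python sketch below is one reading of the column headings: "Within Row" and "Within Column" ask whether all variances inside each row or column are equal, and the "Cross Average" columns ask whether the row or column averages are equal. That reading and the function name are assumptions, tested here against the examples given in the table notes.

```python
def describe_pattern(variances):
    """Classify a rows-by-columns grid of cell variances in Table 1's terms.

    Returns labels for (within row, cross average row,
    within column, cross average column).
    """
    rows = [list(r) for r in variances]
    cols = [list(c) for c in zip(*rows)]

    def all_equal(values):
        return len(set(values)) == 1

    def label(flag):
        return "Equal" if flag else "Unequal"

    within_row = all(all_equal(r) for r in rows)
    across_rows = all_equal([sum(r) / len(r) for r in rows])
    within_col = all(all_equal(c) for c in cols)
    across_cols = all_equal([sum(c) / len(c) for c in cols])
    return tuple(label(f) for f in (within_row, across_rows, within_col, across_cols))


# Examples from the table notes: first a pattern-2 grid, then a pattern-5 grid.
print(describe_pattern([[1, 1, 1, 1], [9, 9, 9, 9]]))
print(describe_pattern([[16, 9, 4, 1], [1, 4, 9, 16]]))
```

The averages are compared for exact equality, which is adequate for the small integer variances used here but would need a tolerance for arbitrary floating-point input.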

Table 2

Type I Error Rate for the Univariate F Test and the James Second-Order Test in
Balanced/Unbalanced Two by Two, Two by Three, or Three by Three Fixed-Effect Factorial
Designs with Small Sample Sizes

                                                   F test                   James test
                                               Type of Design             Type of Design
Design  Variance                      Factor   BA^a    SU^b    EU^c     BA^a    SU^b    EU^c
2 x 2   1, 1 ; 1, 1                   A Row    .0479   .0552   .0499    .0440   .0453   .0555
                                      B Col    .0479   .0534   .0481    .0445   .0460   .0544
                                      AxB      .0471   .0575   .0501    .0439   .0476   .0529
        3, 1 ; 6, 1                   A Row    .0518   .0552   .0136    .0458   .0481   .0531
                                      B Col    .0531   .0534   .0142    .0465   .0473   .0496
                                      AxB      .0545   .0575   .0112    .0476   .0487   .0499
        16, 1 ; 16, 1                 A Row    .0628   .0629   .0045    .0493   .0479   .0481
                                      B Col    .0572   .0601   .0029    .0421   .0445   .0480
                                      AxB      .0626   .0655   .0047    .0453   .0493   .0496
2 x 3   1, 1, 1 ; 1, 1, 1             A Row    .0506   .0499   .0505    .0472   .0481   .0538
                                      B Col    .0486   .0498   .0581    .0426   .0428   .0532
                                      AxB      .0479   .0523   .0557    .0423   .0448   .0537
        3, 1, 1 ; 3, 1, 1             A Row    .0507   .0277   .0125    .0450   .0451   .0463
                                      B Col    .0572   .0278   .0108    .0422   .0449   .0483
                                      AxB      .0607   .0258   .0092    .0445   .0452   .0446
        16, 1, 1 ; 16, 1, 1           A Row    .0628   .0193   .0014    .0467   .0495   .0440
                                      B Col    .0869   .0259   .0019    .0437   .0416   .0431
                                      AxB      .0850   .0276   .0020    .0430   .0434   .0450
3 x 3   1, 1, 1 ; 1, 1, 1 ; 1, 1, 1   A Row    .0445   .0488   .0464    .0395   .0421   .0407
                                      B Col    .0473   .0517   .0558    .0436   .0442   .0420
                                      AxB      .0496   .0503   .0532    .0377   .0390   .0514
        3, 1, 1 ; 3, 1, 1 ; 3, 1, 1   A Row    .0524   .0226   .0091    .0449   .0439   .0408
                                      B Col    .0578   .0240   .0092    .0450   .0392   .0423
                                      AxB      .0606   .0212   .0061    .0365   .0370   .0484
        16, 1, 1 ; 16, 1, 1 ; 16, 1, 1  A Row  .0665   .0104   .0005    .0494   .0470   .0453
                                      B Col    .0845   .0175   .0013    .0451   .0427   .0370
                                      AxB      .0966   .0151   .0009    .0427   .0379   .0449

Sample sizes (cell sizes by row):
2 x 2: ^a BA: Balanced Design (5, 5 ; 5, 5); ^b SU: Slightly Unbalanced (6, 4 ; 4, 6); ^c EU: Extremely Unbalanced (8, 2 ; 7, 3).
2 x 3: ^a BA: Balanced Design (5, 5, 5 ; 5, 5, 5); ^b SU: Slightly Unbalanced (6, 5, 4 ; 7, 5, 3); ^c EU: Extremely Unbalanced (8, 5, 2 ; 9, 4, 2).
3 x 3: ^a BA: Balanced Design (5, 5, 5 ; 5, 5, 5 ; 5, 5, 5); ^b SU: Slightly Unbalanced (6, 5, 4 ; 7, 5, 3 ; 7, 4, 4); ^c EU: Extremely Unbalanced (8, 5, 2 ; 9, 4, 2 ; 8, 4, 3).

Note. Data were sampled from normal distributions. Shading indicates that the value is
greater than the criterion .0544 and the test has an inflated Type I error rate.

Table 3

Type I Error Rate for the Univariate F Test and the James Second-Order Test in
Balanced/Unbalanced Two by Two, Two by Three, or Three by Three Fixed-Effect Factorial
Designs with Large Sample Sizes

                                                   F test                   James test
                                               Type of Design             Type of Design
Design  Variance                      Factor   BA^a    SU^b    EU^c     BA^a    SU^b    EU^c
2 x 2   1, 1 ; 1, 1                   A Row    .0517   .0516   .0497    .0514   .0518   .0484
                                      B Col    .0528   .0527   .0522    .0526   .0525   .0520
                                      AxB      .0463   .0519   .0495    .0462   .0515   .0512
        3, 1 ; 3, 1                   A Row    .0463   .0452   .0132    .0456   .0518   .0479
                                      B Col    .0511   .0429   .0150    .0505   .0485   .0475
                                      AxB      .0529   .0448   .0152    .0518   .0501   .0536
        16, 1 ; 16, 1                 A Row    .0520   .0410   .0029    .0490   .0499   .0481
                                      B Col    .0526   .0428   .0039    .0497   .0528   .0529
                                      AxB      .0539   .0394   .0024    .0514   .0483   .0488
2 x 3   1, 1, 1 ; 1, 1, 1             A Row    .0459   .0497   .0531    .0458   .0498   .0532
                                      B Col    .0510   .0506   .0489    .0500   .0512   .0484
                                      AxB      .0520   .0485   .0513    .0525   .0492   .0489
        3, 1, 1 ; 3, 1, 1             A Row    .0476   .0459   .0293    .0466   .0527   .0511
                                      B Col    .0544   .0503   .0275    .0466   .0527   .0511
                                      AxB      .0554   .0416   .0292    .0485   .0459   .0488
        16, 1, 1 ; 16, 1, 1           A Row    .0528   .0385   .0165    .0500   .0527   .0485
                                      B Col    .0760   .0555   .0287    .0491   .0496   .0476
                                      AxB      .0824   .0563   .0299    .0525   .0470   .0524
3 x 3   1, 1, 1 ; 1, 1, 1 ; 1, 1, 1   A Row    .0492   .0548   .0505    .0488   .0533   .0480
                                      B Col    .0515   .0467   .0526    .0495   .0468   .0492
                                      AxB      .0505   .0521   .0562    .0491   .0477   .0460
        3, 1, 1 ; 3, 1, 1 ; 3, 1, 1   A Row    .0506   .0418   .0126    .0487   .0515   .0487
                                      B Col    .0543   .0451   .0131    .0491   .0526   .0476
                                      AxB      .0590   .0426   .0081    .0481   .0462   .0487
        16, 1, 1 ; 16, 1, 1 ; 16, 1, 1  A Row  .0555   .0321   .0012    .0536   .0498   .0524
                                      B Col    .0756   .0546   .0049    .0483   .0484   .0500
                                      AxB      .0857   .0560   .0022    .0478   .0453   .0474

Sample sizes (cell sizes by row):
2 x 2: ^a BA: Balanced Design (25, 25 ; 25, 25); ^b SU: Slightly Unbalanced (27, 23 ; 26, 24); ^c EU: Extremely Unbalanced (35, 15 ; 37, 13).
2 x 3: ^a BA: Balanced Design (25, 25, 25 ; 25, 25, 25); ^b SU: Slightly Unbalanced (28, 25, 23 ; 27, 26, 22); ^c EU: Extremely Unbalanced (32, 25, 18 ; 30, 26, 17).
3 x 3: ^a BA: Balanced Design (20, 20, 20 ; 20, 20, 20 ; 20, 20, 20); ^b SU: Slightly Unbalanced (23, 19, 18 ; 21, 22, 17 ; 22, 19, 19); ^c EU: Extremely Unbalanced (32, 16, 12 ; 29, 20, 11 ; 30, 21, 9).

Note. Data were sampled from normal distributions. Shading indicates that the value is
greater than the criterion .0544 and the test has an inflated Type I error rate.

Table 4

Statistical Power for the Univariate F Test and the James Second-Order Test in Balanced
Two by Two, Two by Three, or Three by Three Fixed-Effect Factorial Designs

                                                                        Power Difference
Sample Size                        Factor   F test        James Test    (F test - James Test)
5, 5 ; 5, 5                        A Row    .7458         .7314         .0144
                                   B Col    [illegible]   [illegible]   [illegible]
                                   AxB      .7483         .7347         .0136
5, 5, 5 ; 5, 5, 5                  A Row    .5979         .5855         .0124
                                   B Col    .7779         [illegible]   [illegible]
                                   AxB      .7742         .7250         .0492
5, 5, 5 ; 5, 5, 5 ; 5, 5, 5        A Row    .6136         .5723         .0413
                                   B Col    [illegible]   [illegible]   [illegible]
                                   AxB      .8167         .6146         .2021
25, 25 ; 25, 25                    A Row    .5924         .5919         .0005
                                   B Col    .6096         .6091         .0005
                                   AxB      .5997         .5992         .0005
25, 25, 25 ; 25, 25, 25            A Row    .4492         .4489         .0003
                                   B Col    .6292         .6245         .0047
                                   AxB      .6214         .6187         .0027
20, 20, 20 ; 20, 20, 20 ; 20, 20, 20  A Row  .3768        .3720         .0048
                                   B Col    .3739         .3671         .0068
                                   AxB      .5254         .4078         .1176

Note. Data were sampled from normal distributions. The true group mean difference was
created by adding a constant to each observation of the first cell (i.e., Cell_11); the constant
δ was set equal to 2.5 for n_jk = 5 and was set equal to 0.9 for n_jk = 25.

Table 5

Type I Error Rate for the James Second-Order Test in Balanced/Unbalanced Two by Four
Fixed-Effect Factorial Designs with Normal/Non-Normal Distributions and
Homogeneous/Heterogeneous Variances

                                        Balanced Design            Slightly Unbalanced        Extremely Unbalanced
                                        15, 15, 15, 15             18, 16, 14, 12             22, 18, 12, 8
                                        15, 15, 15, 15             17, 16, 14, 13             24, 20, 10, 6
                                        Distribution               Distribution               Distribution
Variance Pattern               Factor   Normal   Skew^a   Platy^b  Normal   Skew     Platy    Normal   Skew     Platy
1, 1, 1, 1 ; 1, 1, 1, 1        A Row    .0471    .0473    .0501    .0500    .0458    .0497    .0447    .0447    .0489
                               B Col    .0515    .0529    .0545    .0480    .0559    .0517    .0492    .0657    .0533
                               AxB      .0502    .0450    .0518    .0482    .0381    .0498    .0482    .0378    .0516
1, 1, 9, 9 ; 1, 1, 9, 9        A Row    .0510    .0461    .0508    .0481    .0455    .0518    .0492    .0460    .0503
                               B Col    .0466    .0738    .0514    .0465    .0741    .0503    .0472    .0913    .0523
                               AxB      .0497    .0392    .0522    .0482    .0432    .0492    .0486    .0328    .0539
4, 4, 1, 1 ; 1, 1, 4, 4        A Row    .0511    .0493    .0507    .0512    .0488    .0498    .0517    .0530    .0551
                               B Col    .0499    .0564    .0504    .0510    .0563    .0493    .0494    .0709    .0577
                               AxB      .0488    .0679    .0521    .0490    .0684    .0515    .0521    .0761    .0584
16, 9, 4, 1 ; 16, 9, 4, 1      A Row    .0495    .0528    .0500    .0523    .0486    .0505    .0505    .0518    .0483
                               B Col    .0459    .0765    .0545    .0463    .0689    .0486    .0499    .0632    .0518
                               AxB      .0476    .0415    .0?04    .0474    .0403    .0491    .0500    .0411    .0496
16, 9, 4, 1 ; 1, 4, 9, 16      A Row    .0490    .0499    .0508    .0484    .0491    .0522    .0496    .0559    .0576
                               B Col    .0492    .0579    .0493    .0534    .0569    .0549    .0496    .0819    .0597
                               AxB      .0506    .0776    .0515    .0504    .0747    .0543    .0497    .1002    .0609
16, 14, 12, 10 ; 2, 4, 6, 8    A Row    .0512    .0498    .0508    .0508    .0545    .0512    .0467    .0439    .0507
                               B Col    .0486    .0586    .0526    .0490    .0565    .0507    .0524    .0654    .0500
                               AxB      .0484    .0525    .0528    .0497    .0455    .0517    .0501    .0406    .0552
16, 14, 12, 10 ; 8, 6, 4, 2    A Row    .0479    .0533    .0502    .0481    .0520    .0522    .0521    .0489    .0486
                               B Col    .0502    .0567    .0538    [missing] .0553   .0578    .0503    .0620    .0544
                               AxB      .0480    .0438    .0536    .0525    .0491    .0516    .0487    .0453    .0503
1, 1, 1, 1 ; 9, 9, 9, 9        A Row    .0493    .0567    .0509    .0491    .0521    .0518    .0526    .0635    .0548
                               B Col    .0504    .0612    .0546    .0463    .0543    .0514    .0501    .0782    .0608
                               AxB      .0471    .0528    .0577    .0464    .0521    .0495    .0498    .0704    .0630

Note. Shading indicates that the value is greater than the criterion .0544; the test has an
inflated Type I error rate.

^a Skew: Skewed-leptokurtic non-normal distribution (skew = 1.75, kurtosis = 3.75).
^b Platy: Platykurtic non-normal distribution (skew = 0, kurtosis = -1.0).